Abstract

Although some information-theoretic measures of uncertainty or granularity have been proposed in rough set theory,
these measures depend only on the underlying partition and the cardinality of the universe, and are independent of the
lower and upper approximations. This seems somewhat unreasonable, since the basic idea of rough set theory is to
describe vague concepts by the lower and upper approximations. In this paper, we thus define new information-theoretic
entropy and co-entropy functions associated with the partition and the approximations to measure the uncertainty
and granularity of an approximation space. After introducing the novel notions of entropy and co-entropy, we
then examine their properties. In particular, we discuss the relationship of co-entropies between different universes.
The theoretical development is accompanied by illustrative numerical examples.
1. Introduction



To handle inexact, uncertain or vague knowledge in some information systems, Pawlak developed rough set theory
in the early 1980s [14, 15]. Since then we have witnessed a systematic, world-wide growth of interest in rough set
theory and its applications in a number of fields, such as granular computing, data mining, decision analysis, pattern
recognition, and approximate reasoning [12, 17, 18, 30, 34, 35].

The starting point of rough set theory in [14, 15] is the idea that elements of a universe having the same description
are indiscernible with respect to the available information. Indiscernibility is described by an equivalence
relation: two elements are related if and only if they are indiscernible from each other.
As is well known, any equivalence relation defined on a universe $U$ determines a partition of $U$ into a collection of
equivalence classes (blocks); each class contains all and only the elements that are mutually equivalent.
Any partition $\pi$ of $U$ represents a piece of knowledge about the elements of $U$ forming a classification, and so any
equivalence class induced by $\pi$ is interpreted as a granule of knowledge contained in (or supported by) $\pi$.

According to Pawlak's terminology in [16], any subset $X$ of the universe $U$ is called a concept in
$U$. If the concept $X$ is a union of equivalence classes from $\pi$, then $X$ is precise in $\pi$; otherwise $X$ is vague. The
basic idea of rough set theory consists in replacing a vague concept with a pair of precise concepts, its lower and
upper approximations [16], and thus a basic problem in this framework is to reason about the accessible granules of
knowledge. To this end, various knowledge granulations (also called information granulations or granulation measures), as
average measures of knowledge granules, have been proposed and addressed in, e.g., [1, 6, 11, 13, 21, 23, 24, 25,
28, 32]. Among them, there are several information-theoretic measures of uncertainty or granularity for rough sets
(see, e.g., [13, 21, 23, 25]), which are based upon the important notion of entropy introduced by Shannon [22];
for more details, we refer the reader to the excellent survey papers [?].

It is worth noting that the information-theoretic measures mentioned above depend only on the sizes of
equivalence classes (essentially, the underlying partition) and the cardinality of the universe, and are independent of the lower
and upper approximation operators. For example, in [6, 13, 23, 26] the information entropy $H(\pi)$ of the partition
$\pi = \{U_1, U_2, \ldots, U_k\}$ is defined as

$$H(\pi) = -\sum_{i=1}^{k} \frac{n_i}{n} \log \frac{n_i}{n},$$

where $n_i$ is the cardinality of $U_i$ and $n = \sum_{i=1}^{k} n_i$. As a result, it often happens that different partitions have
the same entropy (or co-entropy); for instance, $\{\{1\}, \{2\}\}$ and $\{\{1, 2\}, \{3, 4\}\}$ both have entropy 1. This seems somewhat
unreasonable, since the basic idea of rough set theory is to describe vague concepts by the lower and upper approximations.
In other words, the result of this description relies on both the partition and the approximations. In light of this,
we should pay more attention to the lower and upper approximation operators.

The previous observation motivates us to propose another information-theoretic entropy function to measure the
uncertainty associated with the partition and the approximation operators. More concretely, given a universe
$U$ with $n$ elements and a partition $\pi$ of $U$, we count the subsets of $U$ described by every pair of lower and
upper approximations. Assume that $r_i$, $1 \le i \le m$, is the number of subsets described by the rough set approximation
$(A_i, A_i')$ and that every subset of $U$ appears with the same probability. It follows that the rough set approximation
$(A_i, A_i')$ appears with the cumulative probability $r_i/2^n$, since the number of all subsets of $U$ is precisely $2^n$. In this
way, we obtain a probability distribution

$$P(\pi) = \left(\frac{r_1}{2^n}, \frac{r_2}{2^n}, \ldots, \frac{r_m}{2^n}\right).$$

It gives rise to an information entropy, say $\mathcal{H}(\pi)$, according to Shannon's information theory. On the other hand,
we can obtain from the probability distribution a co-entropy $\mathcal{G}(\pi)$. It turns out that $\mathcal{H}(\pi) + \mathcal{G}(\pi) = n$. After exploring
some properties of the entropy and co-entropy, we discuss the relationships of co-entropies between different universes.
Roughly speaking, the co-entropy monotonically increases when the partition becomes coarser. For example, the
co-entropy of $\{\{1, 2\}, \{3, 4\}\}$ is greater than that of $\{\{1\}, \{2\}\}$.

The remainder of the paper is structured as follows. In Section 2, we briefly review some basics of Pawlak's 
rough set theory and the information-theoretic measures of uncertainty and granularity for rough sets in the literature. 
Section 3 is devoted to our novel notions of entropy and co-entropy and their properties. We address the relationship 
of co-entropies between different universes in Section 4 and conclude the paper in Section 5 with a brief discussion
of future research.



2. Preliminaries 

This section consists of two subsections. We briefly recall the definition of Pawlak's rough sets in the first
subsection and then review two information-theoretic measures of uncertainty and granularity in rough set theory in the
second subsection.

2.1. Rough sets 

We start by recalling some basic notions in Pawlak's rough set theory.

Let $U$ be a finite and nonempty universal set, and let $R \subseteq U \times U$ be an equivalence relation on $U$. Denote by $U/R$
the set of all equivalence classes induced by $R$. Such equivalence classes are also called elementary sets; every union
(not necessarily nonempty) of elementary sets is called a definable set.

For any $X \subseteq U$, one can characterize $X$ by a pair of lower and upper approximations. The lower approximation
$\underline{app}_R X$ of $X$ is defined as the greatest definable set contained in $X$, while the upper approximation
$\overline{app}_R X$ of $X$ is defined as the least definable set containing $X$. Formally,

$$\underline{app}_R X = \bigcup \{C \in U/R \mid C \subseteq X\} \quad\text{and}\quad \overline{app}_R X = \bigcup \{C \in U/R \mid C \cap X \neq \emptyset\}.$$

The pair $(\underline{app}_R X, \overline{app}_R X)$ is referred to as the rough set approximation of $X$. It follows immediately from the
definition that $\underline{app}_R X \subseteq X \subseteq \overline{app}_R X$ for any $X \subseteq U$.

The ordered pair $\langle U, R \rangle$ is said to be an approximation space. A rough set in $\langle U, R \rangle$ is the family of all subsets of
$U$ having the same lower and upper approximations. Thus, the general notion of rough set can be simply identified
with the rough set approximation of any given set.
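
The two defining formulas can be turned into code almost literally. The following is a minimal Python sketch, assuming the equivalence classes are supplied as a list of blocks; the function name `approximations` and this data layout are illustrative choices, not notation from the paper.

```python
def approximations(partition, subset):
    """Return the (lower, upper) approximations of `subset`.

    `partition` is an iterable of blocks (the equivalence classes U/R);
    `subset` is any subset X of the universe.
    """
    x = frozenset(subset)
    lower, upper = set(), set()
    for block in partition:
        block = frozenset(block)
        if block <= x:   # block contained in X: contributes to the lower approximation
            lower |= block
        if block & x:    # block meets X: contributes to the upper approximation
            upper |= block
    return frozenset(lower), frozenset(upper)


if __name__ == "__main__":
    pi = [{1, 2}, {3, 4}]
    low, up = approximations(pi, {1, 2, 3})
    print(sorted(low), sorted(up))
```

By construction the lower approximation is contained in `subset` and `subset` is contained in the upper approximation, which mirrors the inclusion stated above.
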
Recall that a partition of $U$ is a collection of nonempty subsets of $U$ such that every element $x$ of $U$ is in exactly
one of these subsets; the subsets making up the partition are called blocks. We write $\Pi(U)$ for the set of all partitions
of $U$ and $\mathcal{P}(U)$ for the power set of $U$. It is well known that the notions of partition and equivalence relation are
essentially equivalent, that is, for any equivalence relation $R$ on $U$, the set $U/R$ is a partition of $U$, and conversely, from
any partition $\pi$ of $U$, one can define an equivalence relation $R_\pi$ on $U$ such that $U/R_\pi = \pi$ in the obvious way. Thus,
we sometimes say that the ordered pair $\langle U, \pi \rangle$ is an approximation space and write $\underline{app}_\pi X$ and $\overline{app}_\pi X$ for
$\underline{app}_{R_\pi} X$ and $\overline{app}_{R_\pi} X$, respectively. More generally, we will use equivalence relation and partition interchangeably.

If a universe $U$ has more than one element, it is always possible to introduce at least two canonical partitions: one
is the trivial partition, denoted by $\hat{\pi}$, consisting of a unique block, and the other is the discrete partition, denoted
by $\check{\pi}$, consisting of all singletons from $U$. Formally,

$$\hat{\pi} = \{U\} \quad\text{and}\quad \check{\pi} = \{\{x\} \mid x \in U\}.$$

We now define a partial order $\preceq$ on $\Pi(U)$. For any $\pi, \sigma \in \Pi(U)$, $\sigma \preceq \pi$ if and only if for any $C \in \sigma$, there exists
$D \in \pi$ such that $C \subseteq D$. For instance, $\check{\pi} \preceq \pi \preceq \hat{\pi}$ for any $\pi \in \Pi(U)$. We say that $\sigma$ is finer than $\pi$ and that $\pi$ is
coarser than $\sigma$ if $\sigma \preceq \pi$. When $\sigma \prec \pi$, that is, $\sigma \preceq \pi$ and $\sigma \neq \pi$, we say that $\sigma$ is strictly finer than $\pi$ and that $\pi$ is
strictly coarser than $\sigma$. Informally, this means that $\sigma$ is a further fragmentation of $\pi$.
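
The refinement order just defined is easy to test mechanically. The sketch below is a small Python illustration, assuming both arguments are partitions of the same universe given as lists of blocks; the names `is_finer` and `is_strictly_finer` are hypothetical helpers introduced here.

```python
def is_finer(sigma, pi):
    """True if every block of `sigma` lies inside some block of `pi`."""
    return all(any(set(c) <= set(d) for d in pi) for c in sigma)


def is_strictly_finer(sigma, pi):
    """True if `sigma` is finer than `pi` and the two partitions differ."""
    same = {frozenset(c) for c in sigma} == {frozenset(d) for d in pi}
    return is_finer(sigma, pi) and not same


if __name__ == "__main__":
    universe = {1, 2, 3, 4}
    trivial = [universe]                    # a single block
    discrete = [{x} for x in universe]      # all singletons
    pi = [{1, 2}, {3, 4}]
    print(is_finer(discrete, pi), is_finer(pi, trivial), is_finer(pi, discrete))
```

The usage lines check the chain from the discrete partition through an intermediate partition to the trivial one, and confirm that the order does not hold in the reverse direction.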

2.2. Information-theoretic measures 

In this subsection, we review two information-theoretic measures associated with rough sets in the literature. 
These measures are concerned with the uncertainty or granularity of knowledge provided by a partition. 

In [13, 23, 26], Shannon entropy [22] has been used as a measure of information for rough set theory as
follows. For subsequent need, we fix a notational convention: throughout the paper, all logarithms are to base 2
unless otherwise specified.

Definition 2.1 ([6, 13, 23, 26]). Let $\langle U, \pi \rangle$ be an approximation space, where the partition $\pi$ consists of blocks $U_i$,
$1 \le i \le k$, each having cardinality $n_i$. The information entropy $H(\pi)$ of partition $\pi$ is defined by

$$H(\pi) = -\sum_{i=1}^{k} \frac{n_i}{n} \log \frac{n_i}{n}, \quad\text{where } n = \sum_{i=1}^{k} n_i. \tag{1}$$

When $\pi = \hat{\pi}$ (the trivial partition), the entropy function $H$ achieves the minimum value 0, and when $\pi = \check{\pi}$ (the discrete
partition), it achieves the maximum value $\log n$. Moreover, it has been shown in [?] that for any two partitions $\pi$ and $\sigma$
of $U$, if $\sigma \preceq \pi$, then $H(\sigma) \ge H(\pi)$. Eq. (1) can be rewritten as follows:

$$H(\pi) = \log n - \frac{1}{n} \sum_{i=1}^{k} n_i \log n_i. \tag{2}$$

Recall that the Hartley measure of uncertainty for a finite set $X$ is

$$H(X) = \log |X|,$$

where $|X|$ denotes the cardinality of the set $X$. It measures the amount of uncertainty associated with a finite set of
possible alternatives, the nonspecificity inherent in the set.

The first term $\log n$ (i.e., $\log |U|$) in Eq. (2) is exactly the Hartley measure of $U$, which is a constant independent of
any partition. The second term of the equation is basically an expectation of granularity with respect to all blocks in
a partition. This quantity has been used by Yao to measure the granularity of a partition in [26] and has been defined
by Liang and Shi as the rough entropy of knowledge in an approximation space in [11]. This quantity has also been
referred to as co-entropy by some scholars (see, for example, [?]).

Definition 2.2 ([2, 3, 11, 26]). Let $\langle U, \pi \rangle$ be an approximation space, where the partition $\pi$ consists of blocks $U_i$,
$1 \le i \le k$, each having cardinality $n_i$. The co-entropy $G(\pi)$ of partition $\pi$ is defined by

$$G(\pi) = \sum_{i=1}^{k} \frac{n_i}{n} \log n_i, \quad\text{where } n = \sum_{i=1}^{k} n_i. \tag{3}$$
It follows immediately from the definitions that

$$H(\pi) + G(\pi) = \log n.$$

Contrary to the uncertainty measure $H$, the co-entropy function $G$ achieves the maximum value $\log n$ when $\pi = \hat{\pi}$ (the
trivial partition) and the minimum value 0 when $\pi = \check{\pi}$ (the discrete partition); moreover, it is known [11] that for any two
partitions $\pi$ and $\sigma$ of $U$, if $\sigma \preceq \pi$, then $G(\sigma) \le G(\pi)$.
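
Since both classical measures depend only on the block sizes, they can be computed from a list of cardinalities. The following Python sketch is one way to do this, under the paper's base-2 convention; the function names `shannon_entropy` and `co_entropy` are illustrative and not taken from the cited literature.

```python
from math import log2


def shannon_entropy(block_sizes):
    """Entropy H of a partition with the given block cardinalities (Eq. (1))."""
    n = sum(block_sizes)
    return -sum(k / n * log2(k / n) for k in block_sizes)


def co_entropy(block_sizes):
    """Co-entropy G of a partition with the given block cardinalities (Eq. (3))."""
    n = sum(block_sizes)
    return sum(k / n * log2(k) for k in block_sizes)


if __name__ == "__main__":
    for sizes in ([1, 1], [2, 2], [2, 1], [4]):
        h, g = shannon_entropy(sizes), co_entropy(sizes)
        # h + g should agree with log2(n) up to rounding
        print(sizes, h, g, h + g, log2(sum(sizes)))
```

The block sizes `[1, 1]` and `[2, 2]` correspond to the partitions $\{\{1\}, \{2\}\}$ and $\{\{1, 2\}, \{3, 4\}\}$, which receive the same value of $H$ although their co-entropies differ.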

As argued in [2, 3], the entropy $H(\pi)$ can be interpreted as the uncertainty measure of the partition $\pi$, while the
co-entropy $G(\pi)$ can be regarded as the granularity measure of $\pi$. In [?], Sen and Pal introduced two other entropy
measures for crisp sets and fuzzy sets with (crisp or fuzzy) equivalence relations or (crisp or fuzzy) tolerance relations,
which are based upon the roughness measures of $X$ and of the complement of $X$ in the universe and have been used
to analyze the grayness and spatial ambiguities in images. Under the same name, there are some different concepts of
entropy in the literature of rough set theory (see, for example, [?]).

3. A novel pair of entropy and co-entropy 

In this section, we first introduce a novel entropy and the corresponding co-entropy and then explore their properties.

Let us begin with some notation. Throughout this section, we write $\langle U, \pi \rangle$ for an approximation space and assume
that $|U| = n$. Given $\langle U, \pi \rangle$, we use $\mathcal{R}(U, \pi)$ to denote the set of rough set approximations of all subsets of $U$. More
formally, we set

$$\mathcal{R}(U, \pi) = \{(\underline{app}_\pi X, \overline{app}_\pi X) \mid X \subseteq U\}. \tag{4}$$

It follows from Eq. (4) that $\mathcal{R}(U, \pi)$ has at least two elements: $(\emptyset, \emptyset)$ and $(U, U)$. If $n = 1$, then $\mathcal{R}(U, \pi)$ consists of
exactly these two elements; if $n > 1$ and $\pi$ is the trivial partition $\{U\}$, then $\mathcal{R}(U, \pi)$ contains one more element,
$(\emptyset, U)$; for any $n \ge 1$, if $\pi$ is the discrete partition, then $\mathcal{R}(U, \pi) = \{(X, X) \mid X \subseteq U\}$, which consists of $2^n$ elements.
Note that the set $\mathcal{R}(U, \pi)$ is not a multiset, that is, the same element cannot appear more than once in $\mathcal{R}(U, \pi)$.
In general, we have $|\mathcal{R}(U, \pi)| \le 2^n$, since the subset $X$ of $U$ in Eq. (4) has only $2^n$ alternatives.

For simplicity, we use $m$ to stand for $|\mathcal{R}(U, \pi)|$. For any $(A_i, A_i') \in \mathcal{R}(U, \pi)$, $1 \le i \le m$, we set

$$\mathcal{X}_i = \{X \subseteq U \mid (\underline{app}_\pi X, \overline{app}_\pi X) = (A_i, A_i')\} \quad\text{and}\quad |\mathcal{X}_i| = r_i. \tag{5}$$

In other words, $r_i$ is the number of subsets of $U$ that have the rough set approximation $(A_i, A_i')$. It turns out that
$\{\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_m\}$ gives rise to a partition of the power set of $U$. Therefore, we get by Eq. (4) that

$$\sum_{i=1}^{m} r_i = 2^n.$$
To illustrate the above concepts, let us see an example. 

Example 3.1. Consider $U = \{1, 2, 3, 4\}$ and $\pi = \{\{1, 2\}, \{3, 4\}\}$. In this case, $U$ has 16 subsets. For each subset $X$ of
$U$, we compute the rough set approximation of $X$; the results are listed in Table 1.
Hence, we see that

$$\mathcal{R}(U, \pi) = \{(\emptyset, \emptyset), (\emptyset, \{1, 2\}), (\emptyset, \{3, 4\}), (\{1, 2\}, \{1, 2\}), (\emptyset, U), (\{3, 4\}, \{3, 4\}), (\{1, 2\}, U), (\{3, 4\}, U), (U, U)\}.$$

As an example, let us calculate $r_2$. By definition,

$$r_2 = |\{X \subseteq U \mid (\underline{app}_\pi X, \overline{app}_\pi X) = (\emptyset, \{1, 2\})\}| = |\{\{1\}, \{2\}\}| = 2.$$

This is exactly the number of subsets of $U$ that have the rough set approximation $(\emptyset, \{1, 2\})$, which can be counted
from the table. In light of this, we may get Table 2 by rearranging Table 1. It follows immediately from Table 2 that
$r_1 = r_4 = r_6 = r_9 = 1$, $r_2 = r_3 = r_7 = r_8 = 2$, and $r_5 = 4$.
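
The counts $r_i$ of Example 3.1 can be reproduced by brute force over all 16 subsets. The following Python sketch does this, assuming the partition is given as a list of blocks; the helper name `approximation_counts` is hypothetical.

```python
from collections import Counter
from itertools import combinations


def approximation_counts(partition):
    """Map each pair (lower, upper) to the number of subsets X having that approximation."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            # union of the blocks contained in X, and of the blocks meeting X
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    return counts


if __name__ == "__main__":
    counts = approximation_counts([{1, 2}, {3, 4}])
    for (lower, upper), r in counts.items():
        print(sorted(lower), sorted(upper), r)
    print(len(counts), sum(counts.values()))
```

The enumeration is exponential in $|U|$, so this is only meant for small universes such as the one in the example.
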
Table 1: The subsets and corresponding rough set approximations in Example 3.1

subset    approximation     subset    approximation     subset    approximation     subset    approximation
∅         (∅, ∅)            {1}       (∅, {1,2})        {2}       (∅, {1,2})        {3}       (∅, {3,4})
{4}       (∅, {3,4})        {1,2}     ({1,2}, {1,2})    {1,3}     (∅, U)            {1,4}     (∅, U)
{2,3}     (∅, U)            {2,4}     (∅, U)            {3,4}     ({3,4}, {3,4})    {1,2,3}   ({1,2}, U)
{1,2,4}   ({1,2}, U)        {1,3,4}   ({3,4}, U)        {2,3,4}   ({3,4}, U)        U         (U, U)


Table 2: The rough set approximations and corresponding subsets in Example 3.1

approximation     subsets
(∅, ∅)            ∅
(∅, {1,2})        {1}, {2}
(∅, {3,4})        {3}, {4}
({1,2}, {1,2})    {1,2}
(∅, U)            {1,3}, {1,4}, {2,3}, {2,4}
({3,4}, {3,4})    {3,4}
({1,2}, U)        {1,2,3}, {1,2,4}
({3,4}, U)        {1,3,4}, {2,3,4}
(U, U)            U


Because we are concerned with the partition granulation of $\langle U, \pi \rangle$ with respect to the approximation operators
$\underline{app}_\pi$ and $\overline{app}_\pi$, we may assume that every subset of $U$ appears with the same probability $1/2^n$. As a result, the rough
set approximation $(A_i, A_i')$ appears with the cumulative probability $r_i/2^n$ and we thus obtain a probability distribution

$$P(\pi) = \left(\frac{r_1}{2^n}, \frac{r_2}{2^n}, \ldots, \frac{r_m}{2^n}\right). \tag{6}$$



According to Shannon's information theory [22], the Shannon entropy function of the probability distribution $P(\pi)$
is defined as follows.

Definition 3.1. Keep the notation as above. The information entropy $\mathcal{H}(\pi)$ of $\langle U, \pi \rangle$ (with respect to the
approximation operators $\underline{app}_\pi$ and $\overline{app}_\pi$) is defined by

$$\mathcal{H}(\pi) = H(P(\pi)) = -\sum_{i=1}^{m} \frac{r_i}{2^n} \log \frac{r_i}{2^n}. \tag{7}$$

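In the above definition, for simplicity we have used the notation $\mathcal{H}(\pi)$ instead of $\mathcal{H}(U, \pi)$. Following the
explanation of Shannon entropy in information theory, the quantity $\mathcal{H}(\pi)$ measures the uncertainty associated with the
partition $\pi$ with respect to the approximation operators $\underline{app}_\pi$ and $\overline{app}_\pi$. For instance, the probability distribution
corresponding to the partition $\pi = \{\{1, 2\}, \{3, 4\}\}$ in Example 3.1 is

$$P(\pi) = \left(\frac{1}{2^4}, \frac{2}{2^4}, \frac{2}{2^4}, \frac{1}{2^4}, \frac{4}{2^4}, \frac{1}{2^4}, \frac{2}{2^4}, \frac{2}{2^4}, \frac{1}{2^4}\right).$$

It follows from Definition 3.1 that

$$\begin{aligned}
\mathcal{H}(\pi) &= -\sum_{i=1}^{9} \frac{r_i}{2^4} \log \frac{r_i}{2^4} \\
&= -\left(\frac{1}{2^4}\log\frac{1}{2^4} + \frac{2}{2^4}\log\frac{2}{2^4} + \frac{2}{2^4}\log\frac{2}{2^4} + \frac{1}{2^4}\log\frac{1}{2^4} + \frac{4}{2^4}\log\frac{4}{2^4}\right. \\
&\qquad \left. {} + \frac{1}{2^4}\log\frac{1}{2^4} + \frac{2}{2^4}\log\frac{2}{2^4} + \frac{2}{2^4}\log\frac{2}{2^4} + \frac{1}{2^4}\log\frac{1}{2^4}\right) \\
&= 3.
\end{aligned}$$
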
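
The entropy of Definition 3.1 can be evaluated numerically by enumerating all subsets and grouping them by their rough set approximation. The sketch below is a minimal Python illustration under that brute-force approach; `class_sizes` and `entropy` are hypothetical helper names, and base-2 logarithms are used as in the paper.

```python
from collections import Counter
from itertools import combinations
from math import log2


def class_sizes(partition):
    """Sizes r_i of the classes of subsets that share one rough set approximation."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    return list(counts.values())


def entropy(partition):
    """Shannon entropy of the distribution (r_1/2^n, ..., r_m/2^n)."""
    sizes = class_sizes(partition)
    total = sum(sizes)  # equals 2**n, the number of subsets of the universe
    return -sum(r / total * log2(r / total) for r in sizes)


if __name__ == "__main__":
    print(entropy([{1, 2}, {3, 4}]))
```

For the partition of Example 3.1 the nine terms are $4 \cdot \frac{1}{16} \cdot 4 + 4 \cdot \frac{2}{16} \cdot 3 + \frac{4}{16} \cdot 2 = 1 + 1.5 + 0.5$, which the sketch evaluates to 3 up to floating-point rounding.
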
Similar to other entropy functions in rough set theory, the information entropy in Definition 3.1 has the following
properties.

Theorem 3.1.

(1) For any $\pi, \sigma \in \Pi(U)$, if $\sigma \prec \pi$, then $\mathcal{H}(\sigma) > \mathcal{H}(\pi)$.

(2) The entropy function $\mathcal{H}$ reaches the maximum value $n$ for the finest partition $\check{\pi}$.

(3) The entropy function $\mathcal{H}$ reaches the minimum value $n - \frac{2^n - 2}{2^n} \log(2^n - 2)$ for the coarsest partition $\hat{\pi}$.



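Proof. (1) Without loss of generality, we may assume that $\pi = \{U_1, U_2, \ldots, U_k\}$ and $\sigma = \{U_a, U_b, U_2, \ldots, U_k\}$, where
$U_a \cup U_b = U_1$. Suppose that $|\mathcal{R}(U, \pi)| = m$ and for any $(A_i, A_i') \in \mathcal{R}(U, \pi)$, $1 \le i \le m$, we write $r_i$ for

$$|\{X \subseteq U \mid (\underline{app}_\pi X, \overline{app}_\pi X) = (A_i, A_i')\}|.$$

Based on the partition $\pi$, the power set of $U$ is partitioned into $m$ blocks and the $i$-th block has cardinality $r_i$.
Similarly, we denote by $s_j$ the cardinality of the $j$-th block of the power set associated with the partition $\sigma$. We now
consider the elements of $\mathcal{R}(U, \sigma)$. For any $(B_j, B_j') \in \mathcal{R}(U, \sigma)$, there are two possibilities. One is that
$(B_j, B_j') \in \mathcal{R}(U, \pi)$, say $(B_j, B_j') = (A_{i_j}, A_{i_j}')$ for some $i_j$. In this case, it is clear that $s_j = r_{i_j}$. The other is that
$(B_j, B_j') \in \mathcal{R}(U, \sigma) \setminus \mathcal{R}(U, \pi)$, where the symbol $A \setminus B$ denotes the set of all elements which are members of $A$ but
not members of $B$. It follows that for some $i_j$,

$$\{X \subseteq U \mid (\underline{app}_\sigma X, \overline{app}_\sigma X) = (B_j, B_j')\} \subset \{X \subseteq U \mid (\underline{app}_\pi X, \overline{app}_\pi X) = (A_{i_j}, A_{i_j}')\},$$

because the partition $\sigma$ is strictly finer than $\pi$. In this case, we also see that the $i_j$-th block provided by $\pi$ is
partitioned into smaller blocks and thus $r_{i_j} = \sum_j s_j > s_j$. In summary, we get that either $r_i = s_j$ or
$r_i = \sum_j s_{i_j} > s_{i_j}$, and moreover, the latter case must occur as $\sigma \prec \pi$. We thus assume that $r_i = s_{j_i}$ for $i \in I_1$
and $r_i = \sum_j s_{i_j} > s_{i_j}$ for $i \in I_2$, where $I_2 \neq \emptyset$ and $I_1 \cup I_2 = \{1, 2, \ldots, m\}$. Let us compare $\mathcal{H}(\sigma)$ with $\mathcal{H}(\pi)$:

$$\begin{aligned}
\mathcal{H}(\sigma) &= -\sum_{j} \frac{s_j}{2^n} \log \frac{s_j}{2^n} \\
&= -\sum_{i \in I_1} \frac{r_i}{2^n} \log \frac{r_i}{2^n} - \sum_{i \in I_2} \sum_{j} \frac{s_{i_j}}{2^n} \log \frac{s_{i_j}}{2^n} \\
&> -\sum_{i \in I_1} \frac{r_i}{2^n} \log \frac{r_i}{2^n} - \sum_{i \in I_2} \sum_{j} \frac{s_{i_j}}{2^n} \log \frac{r_i}{2^n} \\
&= -\sum_{i \in I_1} \frac{r_i}{2^n} \log \frac{r_i}{2^n} - \sum_{i \in I_2} \frac{r_i}{2^n} \log \frac{r_i}{2^n} \\
&= -\sum_{i=1}^{m} \frac{r_i}{2^n} \log \frac{r_i}{2^n} \\
&= \mathcal{H}(\pi),
\end{aligned}$$

namely, $\mathcal{H}(\sigma) > \mathcal{H}(\pi)$. Therefore, clause (1) holds.
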
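(2) It follows from (1) that $\mathcal{H}$ reaches the maximum value when $\pi$ is the discrete partition $\check{\pi}$. In this case, every
subset $X$ of $U$ has its own rough set approximation $(X, X)$, so $m = 2^n$ and $r_i = 1$ for all $i$, and we get by definition that

$$\mathcal{H}(\check{\pi}) = -\sum_{i=1}^{2^n} \frac{1}{2^n} \log \frac{1}{2^n} = n.$$

This proves (2).

(3) By (1), we see that $\mathcal{H}$ reaches the minimum value when $\pi$ is the trivial partition $\hat{\pi}$. In this case, the empty subset
of $U$ has the rough set approximation $(\emptyset, \emptyset)$ and $U$ itself has the rough set approximation $(U, U)$. Any nonempty proper
subset of $U$, if there is one, has the rough set approximation $(\emptyset, U)$. Hence, $r_1 = r_2 = 1$ and $r_3 = 2^n - 2$. We thus obtain
by definition that

$$\begin{aligned}
\mathcal{H}(\hat{\pi}) &= -\frac{1}{2^n}\log\frac{1}{2^n} - \frac{1}{2^n}\log\frac{1}{2^n} - \frac{2^n - 2}{2^n}\log\frac{2^n - 2}{2^n} \\
&= n - \frac{2^n - 2}{2^n}\log(2^n - 2).
\end{aligned}$$

Whence, (3) holds, finishing the proof of the theorem. □

Note that in clause (3) of Theorem 3.1, if $n = 1$, the value of the corresponding summand $0 \log 0$ is taken to be
0, which is consistent with the limit

$$\lim_{x \to 0^+} x \log x = 0.$$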
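
The extreme values in clauses (2) and (3) can be checked numerically on small universes. The following Python sketch enumerates every partition of $\{1, \ldots, n\}$ for a given $n$ and compares the observed maximum and minimum with the stated formulas; all function names are hypothetical, and the check is an illustration rather than a proof.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(partition):
    """Entropy of Definition 3.1, computed by enumerating all subsets."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    total = sum(counts.values())
    return -sum(r / total * log2(r / total) for r in counts.values())


def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in set_partitions(rest):
        for i, block in enumerate(smaller):   # put `first` into an existing block
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        yield [[first]] + smaller             # or give `first` a block of its own


def check_extremes(n, tol=1e-9):
    """Compare max/min entropy over all partitions of {1..n} with clauses (2) and (3)."""
    universe = list(range(1, n + 1))
    values = [entropy(p) for p in set_partitions(universe)]
    k = 2 ** n - 2
    expected_min = n - (k / 2 ** n) * log2(k) if k > 0 else float(n)  # 0 log 0 taken as 0
    at_discrete = entropy([[x] for x in universe])
    at_trivial = entropy([universe])
    return (abs(max(values) - n) < tol and abs(at_discrete - n) < tol
            and abs(min(values) - expected_min) < tol
            and abs(at_trivial - expected_min) < tol)


if __name__ == "__main__":
    print([check_extremes(n) for n in range(1, 5)])
```
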
For later need, let us recall the following definition from [?].

Definition 3.2. Let $\langle U, \pi \rangle$ and $\langle V, \sigma \rangle$ be two approximation spaces, and suppose that $f : U \to V$ is a mapping.

(1) The mapping $f$ is called a homomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$ if for any $C \in \pi$, there exists $D \in \sigma$ such that
$f(C) \subseteq D$, where $f(C) = \{f(u) \mid u \in C\}$.

(2) A homomorphism $f$ is called a monomorphism if $f$ is an injective mapping.

(3) A monomorphism $f$ is called strictly monomorphic if either there exist $C \in \pi$ and $D \in \sigma$ such that $f(C) \subset D$,
namely, $f(C) \subseteq D$ and $f(C) \neq D$, or $|V| > |U|$.

(4) The mapping $f$ is called an isomorphism if $f : U \to V$ is bijective and, moreover, both $f$ and its
inverse mapping $f^{-1}$ are homomorphisms.
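
The four notions of Definition 3.2 can be checked mechanically for finite spaces. The following Python sketch assumes that the mapping $f$ is given as a dictionary defined on every element of $U$ and that partitions are lists of blocks; all function names are hypothetical.

```python
def image(f, block):
    """Image f(C) of a block C under the mapping f (a dict)."""
    return {f[u] for u in block}


def is_homomorphism(f, pi, sigma):
    """Every block of pi is mapped into some block of sigma."""
    return all(any(image(f, c) <= set(d) for d in sigma) for c in pi)


def is_monomorphism(f, pi, sigma):
    injective = len(set(f.values())) == len(f)
    return injective and is_homomorphism(f, pi, sigma)


def is_strict_monomorphism(f, pi, sigma):
    if not is_monomorphism(f, pi, sigma):
        return False
    # f(C) is a proper subset of some block D, or V is larger than U
    proper = any(image(f, c) < set(d) for c in pi for d in sigma)
    size_u = sum(len(c) for c in pi)
    size_v = sum(len(d) for d in sigma)
    return proper or size_v > size_u


def is_isomorphism(f, pi, sigma):
    universe_v = set().union(*sigma)
    bijective = len(set(f.values())) == len(f) and set(f.values()) == universe_v
    if not bijective:
        return False
    inverse = {v: u for u, v in f.items()}
    return is_homomorphism(f, pi, sigma) and is_homomorphism(inverse, sigma, pi)


if __name__ == "__main__":
    f = {1: "a", 2: "b"}
    pi, sigma = [{1, 2}], [{"a", "b"}, {"c", "d"}]
    print(is_monomorphism(f, pi, sigma), is_strict_monomorphism(f, pi, sigma),
          is_isomorphism(f, pi, sigma))
```

The usage lines take a two-element space mapped into a four-element one: the mapping is an injective homomorphism and is strict because the target universe is larger, but it is not an isomorphism.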

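We can now state the following facts.

Proposition 3.1. Let $\langle U, \pi \rangle$ and $\langle V, \sigma \rangle$ be two approximation spaces with $|U| = |V|$, and let $f : U \to V$ be a mapping.

(1) If $f$ is a monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, in particular, $\pi \preceq \sigma$, then $\mathcal{H}(\pi) \ge \mathcal{H}(\sigma)$.

(2) If $f$ is a strict monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, in particular, $\pi \prec \sigma$, then $\mathcal{H}(\pi) > \mathcal{H}(\sigma)$.

(3) If $f$ is an isomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, then $\mathcal{H}(\pi) = \mathcal{H}(\sigma)$.

Proof. It follows immediately from Definition 3.2 and Theorem 3.1. □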

To measure the granularity with respect to the approximation operators $\underline{app}_\pi$ and $\overline{app}_\pi$ carried by the partition $\pi$, we
introduce the concept of co-entropy, which corresponds to the information entropy in Definition 3.1.

Definition 3.3. Keep the notation as in Definition 3.1. The co-entropy $\mathcal{G}(\pi)$ of $\langle U, \pi \rangle$ (with respect to the
approximation operators $\underline{app}_\pi$ and $\overline{app}_\pi$) is defined by

$$\mathcal{G}(\pi) = G(P(\pi)) = \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i. \tag{8}$$

The quantity $\mathcal{G}(\pi)$ furnishes a measure of the average granularity carried by the partition $\pi$ as a whole. It follows
immediately from the definitions that

$$\mathcal{H}(\pi) + \mathcal{G}(\pi) = n. \tag{9}$$

This means that the two measures complement each other with respect to the constant quantity $n = |U|$, which is invariant
with respect to the choice of the partition $\pi$ of $U$.
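
For readers who want the intermediate step, the complementarity identity can be written out as follows. This is a short derivation sketch, writing $\mathcal{H}$ and $\mathcal{G}$ for the entropy and co-entropy of Definitions 3.1 and 3.3; it uses only the base-2 convention, so that $\log 2^n = n$, and the fact that the counts $r_i$ sum to $2^n$.

```latex
\begin{aligned}
\mathcal{H}(\pi) + \mathcal{G}(\pi)
  &= -\sum_{i=1}^{m} \frac{r_i}{2^n} \log \frac{r_i}{2^n}
     + \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i \\
  &= -\sum_{i=1}^{m} \frac{r_i}{2^n} \left( \log r_i - n \right)
     + \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i \\
  &= n \sum_{i=1}^{m} \frac{r_i}{2^n} \\
  &= n .
\end{aligned}
```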

The co-entropy function $\mathcal{G}$ has the following properties.

Theorem 3.2.

(1) For any $\pi, \sigma \in \Pi(U)$, if $\sigma \prec \pi$, then $\mathcal{G}(\sigma) < \mathcal{G}(\pi)$.

(2) The co-entropy function $\mathcal{G}$ reaches the minimum value 0 for the finest partition $\check{\pi}$.

(3) The co-entropy function $\mathcal{G}$ reaches the maximum value $\frac{2^n - 2}{2^n} \log(2^n - 2)$ for the coarsest partition $\hat{\pi}$.

Proof. All the clauses follow directly from Theorem 3.1 and Eq. (9). □

Similar to Proposition 3.1, we have the following observation.

Proposition 3.2. Let $\langle U, \pi \rangle$ and $\langle V, \sigma \rangle$ be two approximation spaces with $|U| = |V|$, and let $f : U \to V$ be a mapping.

(1) If $f$ is a monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, in particular, $\pi \preceq \sigma$, then $\mathcal{G}(\pi) \le \mathcal{G}(\sigma)$.

(2) If $f$ is a strict monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, in particular, $\pi \prec \sigma$, then $\mathcal{G}(\pi) < \mathcal{G}(\sigma)$.

(3) If $f$ is an isomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, then $\mathcal{G}(\pi) = \mathcal{G}(\sigma)$.

Proof. It follows immediately from Proposition 3.1 and Eq. (9). □



As a corollary of Theorem 3.2 and Proposition 3.2, we see that $\mathcal{G}$ is a partition measure on $U$ in the sense of
[?, Definition 3.4], that is, $\mathcal{G}$ is nonnegative and satisfies the following two conditions: $\mathcal{G}(\sigma) \le \mathcal{G}(\pi)$ if $\sigma \preceq \pi$;
$\mathcal{G}(\pi) = \mathcal{G}(\sigma)$ if there exists an isomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$.

Note that our information entropy and co-entropy are not directly based on the blocks of a partition. Therefore, in
general they do not satisfy the definition of expected granularity proposed in [?].



4. Relationship of co-entropies between different universes 

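In the last section, we have seen that if $f$ is a strict monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$ with $|U| = |V|$, in
particular, $\pi \prec \sigma$, then $\mathcal{H}(\pi) > \mathcal{H}(\sigma)$ and $\mathcal{G}(\pi) < \mathcal{G}(\sigma)$. In this section, we consider the monotonicity of $\mathcal{H}$ and
$\mathcal{G}$ for different universes. In other words, we compare $\mathcal{H}(\pi)$ with $\mathcal{H}(\sigma)$ and $\mathcal{G}(\pi)$ with $\mathcal{G}(\sigma)$ when there exists a strict
monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$, where $|V| > |U|$. For convenience, we write $\langle U, \pi \rangle \hookrightarrow \langle V, \sigma \rangle$ if $|V| > |U|$ and
there exists a strict monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$.

We start with the following observation on the entropy function $H$ and the co-entropy function $G$ reviewed in
Section 2.2. Consider $\langle U_1, \pi_1 \rangle = \langle \{1\}, \{\{1\}\} \rangle$, $\langle U_2, \pi_2 \rangle = \langle \{1, 2\}, \{\{1\}, \{2\}\} \rangle$, and
$\langle U_3, \pi_3 \rangle = \langle \{1, 2, 3\}, \{\{1, 3\}, \{2\}\} \rangle$. Clearly,

$$\langle U_1, \pi_1 \rangle \hookrightarrow \langle U_2, \pi_2 \rangle \hookrightarrow \langle U_3, \pi_3 \rangle.$$

It is easy to check by Definition 2.1 that $H(\pi_1) = 0$, $H(\pi_2) = 1$, and $H(\pi_3) = \log 3 - \frac{2}{3} < 1$. This means that the
entropy function $H$ is not monotonic. By the way, we can get by a direct computation that $\mathcal{G}(\pi_1) = 0$, $\mathcal{G}(\pi_2) = 0$, and
$\mathcal{G}(\pi_3) = \frac{1}{2}$.

Let us continue to discuss the monotonicity of the co-entropy function $G$. Consider $\langle U_1, \pi_1 \rangle = \langle \{1\}, \{\{1\}\} \rangle$,
$\langle U_2, \pi_2 \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$, and $\langle U_3, \pi_3 \rangle = \langle \{1, 2, 3\}, \{\{1, 2\}, \{3\}\} \rangle$. Again, we see that

$$\langle U_1, \pi_1 \rangle \hookrightarrow \langle U_2, \pi_2 \rangle \hookrightarrow \langle U_3, \pi_3 \rangle.$$

It is easy to check by Definition 2.2 that $G(\pi_1) = 0$, $G(\pi_2) = 1$, and $G(\pi_3) = \frac{2}{3}$. This shows that the co-entropy
function $G$ is not monotonic either. In this case, we can obtain by a direct computation that $\mathcal{G}(\pi_1) = 0$, $\mathcal{G}(\pi_2) = \frac{1}{2}$, and
$\mathcal{G}(\pi_3) = \frac{1}{2}$.

Finally, we address the monotonicity of the entropy function $\mathcal{H}$. Consider $\langle U_1, \pi_1 \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$,
$\langle U_2, \pi_2 \rangle = \langle \{1, 2, 3\}, \{\{1, 2\}, \{3\}\} \rangle$, and $\langle U_3, \pi_3 \rangle = \langle \{1, 2, 3, 4\}, \{\{1, 2, 4\}, \{3\}\} \rangle$. Obviously, we have that

$$\langle U_1, \pi_1 \rangle \hookrightarrow \langle U_2, \pi_2 \rangle \hookrightarrow \langle U_3, \pi_3 \rangle.$$

By a routine computation we can get that $\mathcal{H}(\pi_1) = \frac{3}{2}$, $\mathcal{H}(\pi_2) = \frac{5}{2}$, and $\mathcal{H}(\pi_3) = \frac{13}{4} - \frac{3}{4} \log 3 < \frac{5}{2}$.
Consequently, the entropy function $\mathcal{H}$ is not monotonic either. On the other hand, it follows from Eq. (9) that
$\mathcal{G}(\pi_1) = \frac{1}{2}$, $\mathcal{G}(\pi_2) = \frac{1}{2}$, and $\mathcal{G}(\pi_3) = \frac{3}{4} + \frac{3}{4} \log 3$.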
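
The values quoted for the three chains can be recomputed directly from the definitions. The sketch below is a minimal Python illustration: the classical measures use block sizes only, while the two new measures are obtained by brute-force enumeration of subsets. All function names are hypothetical, and the partitions are the ones used in the three chains of this section.

```python
from collections import Counter
from itertools import combinations
from math import log2


def class_sizes(partition):
    """Sizes r_i of the classes of subsets that share one rough set approximation."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    return list(counts.values())


def classical_entropy(partition):        # H of Definition 2.1
    sizes = [len(b) for b in partition]
    n = sum(sizes)
    return -sum(k / n * log2(k / n) for k in sizes)


def classical_co_entropy(partition):     # G of Definition 2.2
    sizes = [len(b) for b in partition]
    n = sum(sizes)
    return sum(k / n * log2(k) for k in sizes)


def new_entropy(partition):              # entropy of Definition 3.1
    r = class_sizes(partition)
    total = sum(r)
    return -sum(x / total * log2(x / total) for x in r)


def new_co_entropy(partition):           # co-entropy of Definition 3.3
    r = class_sizes(partition)
    total = sum(r)
    return sum(x / total * log2(x) for x in r)


CHAINS = [
    (classical_entropy, [[{1}], [{1}, {2}], [{1, 3}, {2}]]),
    (classical_co_entropy, [[{1}], [{1, 2}], [{1, 2}, {3}]]),
    (new_entropy, [[{1, 2}], [{1, 2}, {3}], [{1, 2, 4}, {3}]]),
]

if __name__ == "__main__":
    for measure, chain in CHAINS:
        print(measure.__name__, [round(measure(p), 4) for p in chain],
              "new co-entropy:", [round(new_co_entropy(p), 4) for p in chain])
```

In each chain the first listed measure rises and then falls, while the new co-entropy computed for the same partitions never decreases.
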
As a result, in all the above three cases we always have that

$$\mathcal{G}(\pi_1) \le \mathcal{G}(\pi_2) \le \mathcal{G}(\pi_3).$$

We thus conjecture that $\mathcal{G}(\pi) \le \mathcal{G}(\sigma)$ whenever $\langle U, \pi \rangle \hookrightarrow \langle V, \sigma \rangle$. Indeed, it holds true, as we will see later.
To prove the conjecture, it is convenient to introduce the following notion and a key lemma.

Definition 4.1. Let $\langle U, \pi \rangle$ be an approximation space and $a \notin U$. The approximation space $\langle U \cup \{a\}, \pi \cup \{\{a\}\} \rangle$ is
called the one-point extension of $\langle U, \pi \rangle$ by $a$. We say that $\langle V, \sigma \rangle$ is a one-point extension of $\langle U, \pi \rangle$ if
$\langle V, \sigma \rangle = \langle U \cup \{a\}, \pi \cup \{\{a\}\} \rangle$ for some $a$.

For example, $\langle U_2, \pi_2 \rangle = \langle \{1, 2, 3\}, \{\{1, 2\}, \{3\}\} \rangle$ is the one-point extension of $\langle U_1, \pi_1 \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$ by 3.
The following lemma shows that one-point extension does not change co-entropy.

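Lemma 4.1. Let $\langle V, \sigma \rangle$ be a one-point extension of $\langle U, \pi \rangle$. Then $\mathcal{G}(\sigma) = \mathcal{G}(\pi)$.

Proof. Suppose that $\pi = \{U_1, U_2, \ldots, U_k\}$ and $\sigma = \{U_1, U_2, \ldots, U_k, \{a\}\}$, where $a \notin U$; assume that
$\mathcal{R}(U, \pi) = \{(A_i, A_i') \mid 1 \le i \le m\}$ and $\mathcal{X}_i = \{X \subseteq U \mid (\underline{app}_\pi X, \overline{app}_\pi X) = (A_i, A_i')\}$ with $|\mathcal{X}_i| = r_i$. It thus
follows that

$$\mathcal{R}(V, \sigma) = \mathcal{R}(U, \pi) \cup \{(A_i \cup \{a\}, A_i' \cup \{a\}) \mid 1 \le i \le m\}.$$

For any $(B_i, B_i') \in \mathcal{R}(V, \sigma)$, we write $\mathcal{S}_i$ for $\{X \subseteq V \mid (\underline{app}_\sigma X, \overline{app}_\sigma X) = (B_i, B_i')\}$ and $s_i$ for $|\mathcal{S}_i|$. If
$(B_i, B_i') = (A_i, A_i') \in \mathcal{R}(U, \pi)$, then we see that $\mathcal{S}_i = \mathcal{X}_i$ and thus $s_i = r_i$ in this case. If
$(B_i, B_i') = (A_i \cup \{a\}, A_i' \cup \{a\})$, then we have that $\mathcal{S}_i = \{X \cup \{a\} \mid X \in \mathcal{X}_i\}$ and $s_i = r_i$ still holds in this case.
Therefore, since $|V| = n + 1$, we get by Definition 3.3 that

$$\begin{aligned}
\mathcal{G}(\sigma) &= \sum_{i=1}^{m} \frac{r_i}{2^{n+1}} \log r_i + \sum_{i=1}^{m} \frac{r_i}{2^{n+1}} \log r_i \\
&= \sum_{i=1}^{m} \frac{r_i}{2^n} \log r_i \\
&= \mathcal{G}(\pi),
\end{aligned}$$

finishing the proof of the lemma. □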
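
The invariance stated in the lemma can be observed numerically. The following Python sketch adds a fresh singleton block to a few small partitions and compares the co-entropy before and after; `co_entropy` and `one_point_extension` are hypothetical helper names, and the co-entropy is computed by brute-force enumeration of subsets.

```python
from collections import Counter
from itertools import combinations
from math import log2


def co_entropy(partition):
    """Co-entropy of Definition 3.3, computed by enumerating all subsets."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    total = sum(counts.values())
    return sum(r / total * log2(r) for r in counts.values())


def one_point_extension(partition, a):
    """Add the singleton block {a}, where a must not already be in the universe."""
    if any(a in block for block in partition):
        raise ValueError("the new point must lie outside the universe")
    return [set(block) for block in partition] + [{a}]


if __name__ == "__main__":
    for pi in ([{1, 2}], [{1, 2}, {3, 4}], [{1, 2, 4}, {3}]):
        extended = one_point_extension(pi, 5)
        print(pi, co_entropy(pi), co_entropy(extended))
```

Each extension doubles the number of subsets and duplicates every class size $r_i$ once, which is why the two printed values agree up to rounding.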

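For subsequent need, we would like to generalize Definition 4.1 as follows.

Definition 4.2. Let $\langle U, \pi \rangle$ and $\langle V, \sigma \rangle$ be two approximation spaces. We say that $\langle V, \sigma \rangle$ is a multi-one-point extension
of $\langle U, \pi \rangle$ if there are approximation spaces $\langle U_i, \pi_i \rangle$, $0 \le i \le l$, with $\langle U_0, \pi_0 \rangle = \langle U, \pi \rangle$ and $\langle U_l, \pi_l \rangle = \langle V, \sigma \rangle$ such that
each $\langle U_i, \pi_i \rangle$, $1 \le i \le l$, is a one-point extension of $\langle U_{i-1}, \pi_{i-1} \rangle$.

For example, $\langle V, \sigma \rangle = \langle \{1, 2, 3, 4\}, \{\{1, 2\}, \{3\}, \{4\}\} \rangle$ is a multi-one-point extension of $\langle U, \pi \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$. In
fact, we may take $\langle U_0, \pi_0 \rangle = \langle U, \pi \rangle$, $\langle U_1, \pi_1 \rangle = \langle \{1, 2, 3\}, \{\{1, 2\}, \{3\}\} \rangle$, and $\langle U_2, \pi_2 \rangle = \langle V, \sigma \rangle$.
The following fact follows immediately from Lemma 4.1.

Corollary 4.1. If $\langle V, \sigma \rangle$ is a multi-one-point extension of $\langle U, \pi \rangle$, then $\mathcal{G}(\sigma) = \mathcal{G}(\pi)$.

In light of the above corollary, let us refer to multi-one-point extensions as one-point extensions for simplicity.
Further, we have the following observation.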

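Theorem 4.1. Suppose that there is a monomorphism from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$. If there exists $\langle U', \pi' \rangle$ that satisfies the
following two conditions:

(a) either $\langle U', \pi' \rangle = \langle U, \pi \rangle$ or $\langle U', \pi' \rangle$ is a one-point extension of $\langle U, \pi \rangle$,

(b) $\langle U', \pi' \rangle$ is isomorphic to $\langle V, \sigma \rangle$,

then $\mathcal{G}(\pi) = \mathcal{G}(\sigma)$; otherwise, $\mathcal{G}(\pi) < \mathcal{G}(\sigma)$.

Proof. We first consider the case that there exists $\langle U', \pi' \rangle$ that satisfies the conditions (a) and (b). In this case, if
$\langle U', \pi' \rangle = \langle U, \pi \rangle$ and $\langle U', \pi' \rangle$ is isomorphic to $\langle V, \sigma \rangle$, then $|V| = |U|$ and we see by Proposition 3.2 that
$\mathcal{G}(\pi) = \mathcal{G}(\sigma)$. If $\langle U', \pi' \rangle$ is a one-point extension of $\langle U, \pi \rangle$ and $\langle U', \pi' \rangle$ is isomorphic to $\langle V, \sigma \rangle$, then we get
that $\mathcal{G}(\pi) = \mathcal{G}(\pi')$ by Corollary 4.1 and $\mathcal{G}(\pi') = \mathcal{G}(\sigma)$ by Proposition 3.2. Consequently, $\mathcal{G}(\pi) = \mathcal{G}(\sigma)$.

We now consider the case that there does not exist $\langle U', \pi' \rangle$ such that the conditions are satisfied. This forces
the monomorphism, say $f$, from $\langle U, \pi \rangle$ to $\langle V, \sigma \rangle$ to be strict. Two cases need to be considered. One is that $|V| = |U|$. In
this case, it follows from Proposition 3.2 that $\mathcal{G}(\pi) < \mathcal{G}(\sigma)$. The other case is that $|V| > |U|$. In this case, let us set
$\langle V_1, \sigma_1 \rangle = \langle f(U), f(\pi) \rangle$, where $f(U)$ is the image of $U$ under $f$ and $f(\pi) = \{f(C) \mid C \in \pi\}$. In fact, $f$ gives rise to an
isomorphism between $\langle U, \pi \rangle$ and $\langle V_1, \sigma_1 \rangle$. Therefore, $\mathcal{G}(\pi) = \mathcal{G}(\sigma_1)$. Note that $V_1 = f(U) \subset V$. We now take
$\langle V_2, \sigma_2 \rangle$ as follows:

$$V_2 = V, \qquad \sigma_2 = \sigma_1 \cup \{\{a\} \mid a \in V \setminus V_1\}.$$

It follows that $\langle V_2, \sigma_2 \rangle$ is a one-point extension of $\langle V_1, \sigma_1 \rangle$. Hence, $\mathcal{G}(\sigma_1) = \mathcal{G}(\sigma_2)$. Because $f$ is a strict
monomorphism, we see that $\langle U, \pi \rangle \hookrightarrow \langle V_2, \sigma_2 \rangle$ and $\sigma_2 \prec \sigma$. Whence, we get $\mathcal{G}(\sigma_2) < \mathcal{G}(\sigma)$ by Theorem 3.2. As a result,
$\mathcal{G}(\pi) < \mathcal{G}(\sigma)$. This completes the proof of the theorem. □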

Let us provide an informal explanation of Theorem 4.1. The hypothesis that there is a monomorphism from $\langle U, \pi \rangle$
to $\langle V, \sigma \rangle$ means that $\langle U, \pi \rangle$ is finer than $\langle V, \sigma \rangle$. In the special case that the monomorphism is not strict,
$\langle U, \pi \rangle$ and $\langle V, \sigma \rangle$ are isomorphic, and thus they have the same co-entropy. If the monomorphism is strict, then after
renaming the elements of $U$, we can get a partition of $V$ that is finer than $\sigma$ by using one-point extensions. Theorem 4.1
says that $\mathcal{G}(\pi) \le \mathcal{G}(\sigma)$ if $\pi$ is finer than $\sigma$ in this sense.

We end this section with several examples. 

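Example 4.1. A trivial example is $\langle U, \pi \rangle = \langle \{1, 2, 3\}, \{\{1, 2\}, \{3\}\} \rangle$ and $\langle V, \sigma \rangle = \langle \{a, b, c\}, \{\{a, b\}, \{c\}\} \rangle$. The
mapping $f$ that maps 1, 2, and 3 to $a$, $b$, and $c$, respectively, is a monomorphism. In fact, $f$ is an isomorphism. Hence,
$\mathcal{G}(\pi) = \mathcal{G}(\sigma)$. A direct computation shows that $\mathcal{G}(\pi) = 0.5 = \mathcal{G}(\sigma)$.

Consider $\langle U, \pi \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$ and $\langle V, \sigma \rangle = \langle \{a, b, c, d\}, \{\{a, b\}, \{c\}, \{d\}\} \rangle$. The mapping $f$ that maps 1 and 2 to
$a$ and $b$, respectively, is a monomorphism, which yields that $\langle U, \pi \rangle$ is isomorphic to $\langle V_1, \sigma_1 \rangle = \langle \{a, b\}, \{\{a, b\}\} \rangle$. Clearly,
we can get $\langle V, \sigma \rangle$ by one-point extensions of $\langle V_1, \sigma_1 \rangle$. Therefore, $\mathcal{G}(\pi) = \mathcal{G}(\sigma)$. On the other hand, we can get by a
computation that $\mathcal{G}(\pi) = 0.5 = \mathcal{G}(\sigma)$.

Finally, consider $\langle U, \pi \rangle = \langle \{1, 2\}, \{\{1, 2\}\} \rangle$ and $\langle V, \sigma \rangle = \langle \{a, b, c, d\}, \{\{a, b\}, \{c, d\}\} \rangle$. As mentioned earlier, the
mapping $f$ that maps 1 and 2 to $a$ and $b$, respectively, is a monomorphism, which gives an isomorphism between $\langle U, \pi \rangle$
and $\langle V_1, \sigma_1 \rangle = \langle \{a, b\}, \{\{a, b\}\} \rangle$. We can get $\langle V_2, \sigma_2 \rangle = \langle \{a, b, c, d\}, \{\{a, b\}, \{c\}, \{d\}\} \rangle$ by one-point extensions of
$\langle V_1, \sigma_1 \rangle$. Clearly, $\sigma_2 \prec \sigma$. As a result, $\mathcal{G}(\pi) < \mathcal{G}(\sigma)$. On the other hand, we can obtain by a direct computation that
$\mathcal{G}(\pi) = 0.5$ and $\mathcal{G}(\sigma) = 1$, the latter because $\sigma$ has the same class sizes $1, 2, 2, 1, 4, 1, 2, 2, 1$ as in Example 3.1.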
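
The co-entropies in Example 4.1 can be recomputed from Definition 3.3. The following Python sketch evaluates them by brute force for the five partitions involved; the helper name `co_entropy` and the dictionary labels are illustrative. For $\{\{a, b\}, \{c, d\}\}$ the class sizes are $1, 2, 2, 1, 4, 1, 2, 2, 1$, as in Example 3.1, so the definition gives $(4 \cdot 2 \cdot \log 2 + 4 \cdot \log 4)/16 = 1$.

```python
from collections import Counter
from itertools import combinations
from math import log2


def co_entropy(partition):
    """Co-entropy of Definition 3.3, computed by enumerating all subsets."""
    blocks = [frozenset(b) for b in partition]
    universe = sorted(frozenset().union(*blocks))
    counts = Counter()
    for size in range(len(universe) + 1):
        for subset in combinations(universe, size):
            x = frozenset(subset)
            lower = frozenset().union(*(b for b in blocks if b <= x))
            upper = frozenset().union(*(b for b in blocks if b & x))
            counts[(lower, upper)] += 1
    total = sum(counts.values())
    return sum(r / total * log2(r) for r in counts.values())


# The partitions appearing in Example 4.1 (labels are only for printing).
EXAMPLES = {
    "{{1,2},{3}}": [{1, 2}, {3}],
    "{{a,b},{c}}": [{"a", "b"}, {"c"}],
    "{{1,2}}": [{1, 2}],
    "{{a,b},{c},{d}}": [{"a", "b"}, {"c"}, {"d"}],
    "{{a,b},{c,d}}": [{"a", "b"}, {"c", "d"}],
}

if __name__ == "__main__":
    for label, partition in EXAMPLES.items():
        print(label, co_entropy(partition))
```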



5. Conclusion 

In this paper, we have proposed the novel notions of entropy and co-entropy by taking both partitions and the 
lower and upper approximations into account. Some desirable properties of the entropy and co-entropy have been 
presented. Furthermore, we have investigated the relationship of co-entropies between different universes. 

There are several problems which are worth further study. Firstly, the present work focuses on the classical
rough sets based on partitions. It would be interesting to generalize the notions of entropy and co-entropy here
to the framework of covering rough sets or fuzzy rough sets. It is also interesting to compare the
entropies (co-entropies) under some special homomorphisms such as neighborhood-consistent functions introduced
in [33]. Secondly, it remains to develop the corresponding roughness measure based on the entropy or co-entropy
for measuring numerically the roughness of an approximation. Finally, the conditioned entropy and conditioned
co-entropy are yet to be addressed in our framework.